Introducing MegaHAL
Authors: Jason L. Hutchens and Michael D. Alder
Abstract
Conversation simulators are computer programs which give the appearance of conversing with a user in natural language. Alan Turing devised a simple test in order to decide whether such programs are intelligent. In 1991, the Cambridge Centre for Behavioural Studies held the first formal instantiation of the Turing Test. In this incarnation the test was known as the Loebner contest, as Dr. Hugh Loebner pledged a $100,000 grand prize for the first computer program to pass the test. In this paper we give a brief background to the contest, before describing in detail the workings of MegaHAL, the primary author's entry to the 1998 Loebner contest.

1 Introduction

Alan Turing was a brilliant British mathematician who played a great role in the development of the computer. The imitation game, nowadays known as the Turing test, was devised by Turing as a method for deciding whether or not a computer program is intelligent.

The Turing test takes place between an interrogator and two subjects. The interrogator communicates with these subjects via a computer terminal, and must decide which is a human being and which is a computer program. The human being helps the interrogator to make the correct identification, while the computer program attempts to trick the interrogator into making the wrong identification. If the latter case occurs, the computer program is said to be exhibiting intelligence (Turing, 1992).

One of the great advantages of the Turing test is that it allows the interrogator to evaluate almost all of the evidence that we would assume to constitute thinking (Moor, 1976). For instance, the interrogator can pose hypothetical situations in order to ask the subjects how they would react.

Alan Turing died in 1954, a decade before conversation simulators such as ELIZA emerged. It is indeed unfortunate that he did not live to witness his test being performed. One cannot help but think that he would have been disappointed.
2 The Loebner Contest

Apart from a few limited tests performed by programmers of conversation simulators (Colby, 1981), the Turing test was not formally conducted until 1995. Although the inaugural Loebner contest, held in 1991, was touted as the first formal instantiation of the Turing test, it was not until 1995 that it truly satisfied Turing's original specifications (Hutchens, 1996).

The first Loebner contest was held on the 8th of November 1991 in Boston's Computer Museum. Because this was a contest rather than an experiment, six computer programs were accepted as subjects. Four human subjects and ten judges were selected from respondents to a newspaper advertisement; none of them had any special expertise in Computer Science (Epstein, 1992).

The original Turing test involved a binary decision between two subjects by a single judge. With ten subjects and ten judges, the situation was somewhat more complex. After months of deliberation, the prize committee developed a suitable scoring mechanism. Each judge was required to rank the subjects from least human-like to most human-like, and to mark the point at which they believed the subjects switched from computer programs to human beings.

If the median rank of a computer program exceeded the median rank of at least one of the human subjects, then that computer program would win the grand prize of $100,000.¹ If there was no grand prize winner, the computer program with the highest median rank would win the contest with a prize of $2000.

¹ Today the program must also satisfy audio-visual requirements to win the grand prize.

Jason L. Hutchens and Michael D. Alder (1998) Introducing MegaHal. In D.M.W. Powers (ed.) NeMLaP3/CoNLL98 Workshop on Human Computer Conversation, ACL, pp 271-274.
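The median-rank scoring rule described in Section 2 can be illustrated with a short Python sketch. This is not the prize committee's actual procedure: the function names, the data layout, and the toy rankings are hypothetical. It assumes that rank 1 means least human-like and that larger ranks mean more human-like, and it ignores the judges' marked switch-over point. Ties for the highest median rank are resolved here by list order, which is also an assumption.

```python
from statistics import median


def median_ranks(rankings):
    """Return each subject's median rank across all judges.

    `rankings` holds one dict per judge, mapping subject name to rank.
    Assumption: rank 1 is least human-like, larger ranks are more human-like.
    """
    subjects = rankings[0].keys()
    return {s: median(judge[s] for judge in rankings) for s in subjects}


def score_contest(rankings, programs, humans):
    """Apply the two prize rules described in the text (hypothetical helper)."""
    medians = median_ranks(rankings)
    # Grand prize: a program's median rank exceeds that of at least one human,
    # which is equivalent to exceeding the lowest human median.
    lowest_human = min(medians[h] for h in humans)
    grand_prize = [p for p in programs if medians[p] > lowest_human]
    # Contest winner: the program with the highest median rank
    # (ties go to the program listed first, an assumption of this sketch).
    contest_winner = max(programs, key=lambda p: medians[p])
    return grand_prize, contest_winner


# Toy data: three judges, two programs (P1, P2), two humans (H1, H2).
rankings = [
    {"P1": 1, "P2": 2, "H1": 3, "H2": 4},
    {"P1": 1, "P2": 3, "H1": 2, "H2": 4},
    {"P1": 2, "P2": 3, "H1": 1, "H2": 4},
]
grand_prize, contest_winner = score_contest(rankings, ["P1", "P2"], ["H1", "H2"])
print(grand_prize, contest_winner)
```

In this toy data, P2 has a median rank of 3 and H1 a median rank of 2, so P2 would qualify for the grand prize under the rule as stated.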
3 Conversation Simulators

Since its inception, the Loebner contest has primarily attracted hobbyist entries which simulate conversation using template matching, a method employed by Joseph Weizenbaum in his ELIZA conversation simulator, developed at MIT between 1964 and 1966. Put simply, these programs look for certain patterns of words in the user's input, and reply with a predetermined output, which may contain blanks to be filled in with details such as the user's name.

Such programs are effective because they exploit the fact that human beings tend to read much more meaning into what is said than is actually there; we are fooled into reading structure into chaos, and we interpret non-sequitur as whimsical conversation (Shieber, 1994).

Weizenbaum was shocked at the reaction to ELIZA. He noticed three main phenomena which disturbed him greatly (Weizenbaum, 1976):

1. A number of practising psychiatrists believed that ELIZA could grow into an almost completely automatic form of psychotherapy.
2. Users very quickly became emotionally involved; Weizenbaum's secretary, for example, demanded to be left alone with the program.
3. Some people believed that the program demonstrated a general solution to the problem of computer understanding of natural language.

Over three decades have passed since ELIZA was created. Computers have become significantly more powerful, while storage space and memory size have increased exponentially. Yet, at least as far as the entrants of the Loebner contest go, the capabilities of conversation simulators have remained exactly where they were thirty years ago. Indeed, judges in the 1991 contest said that they felt let down after talking to the computer entrants, as they had had their expectations raised when using ELIZA during the selection process.
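The template-matching approach described at the start of this section can be sketched in a few lines of Python. The rules, the default reply, and the `reply` function below are hypothetical illustrations; they are not taken from Weizenbaum's ELIZA scripts or from any Loebner contest entry. The sketch only shows the general idea: search the input for a known pattern and fill the blanks of a canned response with the captured words.

```python
import re

# Hypothetical rules: (pattern to look for, reply template with a blank).
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE), "Pleased to meet you, {0}."),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]

# Used when no pattern matches; a content-free prompt keeps the user talking.
DEFAULT_REPLY = "Please go on."


def reply(user_input):
    """Return the canned reply for the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            # Strip trailing punctuation so the captured text fits the blank.
            captured = [group.strip(" .!?") for group in match.groups()]
            return template.format(*captured)
    return DEFAULT_REPLY


for line in ["Hello, my name is Alice.", "I feel tired today.", "What is the weather?"]:
    print(reply(line))
```

Nothing in this sketch models the meaning of the input. Any appearance of understanding comes from the reader, which is the effect the section attributes to such programs.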